SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
Developing tools in the context of autonomous systems [22, 24], such as
self-driving cars (SDCs), is time-consuming and costly since researchers and
practitioners rely on expensive computing hardware and simulation software. We
propose SensoDat, a dataset of 32,580 executed simulation-based SDC test cases
generated with state-of-the-art test generators for SDCs. The dataset consists
of trajectory logs and a variety of sensor data from the SDCs (e.g., rpm, wheel
speed, brake thermals, transmission, etc.) represented as a time series. In
total, SensoDat provides data from 81 different simulated sensors. Future
research in the domain of SDCs does not necessarily depend on executing
expensive test cases when using SensoDat. Furthermore, given the large amount
and variety of sensor data, we think SensoDat can contribute to research,
particularly for AI development, regression testing techniques for
simulation-based SDC testing, flakiness in simulation, etc. Link to the
dataset: https://doi.org/10.5281/zenodo.1030747
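Consuming such trajectory logs amounts to iterating over per-step sensor samples. The sketch below assumes a simplified record layout (a timestamp plus named channels such as wheel_speed and rpm); the actual SensoDat schema may differ.

```python
# A minimal sketch of treating a trajectory log as per-sensor time series.
# The record layout and channel names ("t", "wheel_speed", "rpm") are
# assumptions for illustration; consult SensoDat's documentation for the
# real schema.

def max_reading(log, channel):
    """Peak value of one sensor channel over the whole simulation run."""
    return max(sample[channel] for sample in log)

def duration(log):
    """Simulated time covered by the log, in seconds."""
    return log[-1]["t"] - log[0]["t"]

# A toy log: each entry is one simulation step with a timestamp.
log = [
    {"t": 0.0, "wheel_speed": 0.0, "rpm": 800},
    {"t": 0.1, "wheel_speed": 2.5, "rpm": 1500},
    {"t": 0.2, "wheel_speed": 5.1, "rpm": 2300},
]
print(max_reading(log, "rpm"), duration(log))  # -> 2300 0.2
```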
Diversity-guided Search Exploration for Self-driving Cars Test Generation through Frenet Space Encoding
The rise of self-driving cars (SDCs) presents important safety challenges to
address in dynamic environments. While field testing is essential, current
methods lack diversity in assessing critical SDC scenarios. Prior research
introduced simulation-based testing for SDCs, with Frenetic, a test generation
approach based on Frenet space encoding, achieving a relatively high percentage
of valid tests (approximately 50%) characterized by naturally smooth curves.
The "minimal out-of-bound distance" is often taken as a fitness function, which
we argue is a sub-optimal metric. Instead, we show that the likelihood of
leading to an out-of-bound condition can be learned by a vanilla deep-learning
transformer model. We combine this "inherently learned metric" with a genetic
algorithm, which has been shown to produce a high diversity of tests.
To validate our approach, we conducted a large-scale empirical evaluation on a
dataset comprising over 1,174 simulated test cases created to challenge the
SDCs' behavior. Our investigation revealed that our approach substantially
reduces the number of non-valid test cases, increases test diversity, and
achieves high accuracy in identifying safety violations during SDC test
execution.
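The Frenet space encoding mentioned above represents a road as a sequence of curvature values; Cartesian road points are recovered by integrating the heading. A minimal sketch of that decoding step (an illustration of the idea, not Frenetic's actual implementation):

```python
import math

def frenet_to_cartesian(kappas, ds=1.0, x0=0.0, y0=0.0, theta0=0.0):
    """Decode a curvature sequence (a Frenet-style road encoding) into 2D
    points by integrating the heading: at each step of length ds the
    heading turns by kappa * ds, then the position advances along it."""
    points = [(x0, y0)]
    x, y, theta = x0, y0, theta0
    for kappa in kappas:
        theta += kappa * ds          # curvature bends the heading
        x += ds * math.cos(theta)    # advance along the new heading
        y += ds * math.sin(theta)
        points.append((x, y))
    return points

# Zero curvature decodes to a straight road along the initial heading ...
straight = frenet_to_cartesian([0.0] * 5)
# ... while constant positive curvature yields a smooth left-bending arc.
arc = frenet_to_cartesian([0.1] * 5)
print(straight[-1])  # -> (5.0, 0.0)
```

Because curvature varies smoothly along the sequence, roads generated this way are naturally smooth, which is why the encoding yields a high share of valid tests.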
Cost-effective Simulation-based Test Selection in Self-driving Cars Software
Simulation environments are essential for the continuous development of
complex cyber-physical systems such as self-driving cars (SDCs). Previous
results on simulation-based testing for SDCs have shown that many automatically
generated tests do not strongly contribute to the identification of SDC faults
and hence do not help increase the quality of SDCs. Because running
such "uninformative" tests generally leads to a waste of computational
resources and a drastic increase in the testing cost of SDCs, testers should
avoid them. However, identifying "uninformative" tests before running them
remains an open challenge. Hence, this paper proposes SDC-Scissor, a framework
that leverages Machine Learning (ML) to identify SDC tests that are unlikely to
detect faults in the SDC software under test, thus enabling testers to skip
their execution and drastically increase the cost-effectiveness of
simulation-based testing of SDC software. Our evaluation of six ML models on
two large datasets comprising 22'652 tests showed that SDC-Scissor achieved a
classification F1-score of up to 96%. Moreover, our
results show that SDC-Scissor outperformed a randomized baseline in identifying
more failing tests per time unit.
Webpage & Video: https://github.com/ChristianBirchler/sdc-scisso
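The core idea of SDC-Scissor, predicting before execution whether a test is likely to detect a fault from statically computable features, can be sketched as a simple logistic scoring rule. The feature names and weights below are purely illustrative, not the learned models from the paper.

```python
import math

def fault_likelihood(features, weights, bias):
    """Logistic score: estimated probability that a test detects a fault,
    computed from static road features available before execution.
    The weights here are illustrative constants, not learned parameters."""
    z = bias + sum(w * features[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))

def select_tests(tests, weights, bias, threshold=0.5):
    """Keep only tests predicted to be informative; skip the rest."""
    return [
        t for t in tests
        if fault_likelihood(t["features"], weights, bias) >= threshold
    ]

# Hypothetical static features computable without running the simulation.
weights = {"num_sharp_turns": 0.9, "total_curvature": 0.4}
tests = [
    {"id": "t1", "features": {"num_sharp_turns": 5, "total_curvature": 3.2}},
    {"id": "t2", "features": {"num_sharp_turns": 0, "total_curvature": 0.1}},
]
selected = select_tests(tests, weights, bias=-2.0)
print([t["id"] for t in selected])  # -> ['t1']
```

Skipped tests are never simulated, which is where the cost saving comes from; the threshold trades missed failing tests against saved simulation time.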
Machine Learning-based Test Selection for Simulation-based Testing of Self-driving Cars Software
Simulation platforms facilitate the development of emerging Cyber-Physical
Systems (CPS) like self-driving cars (SDC) because they are more efficient and
less dangerous than field operational test cases. Despite this, thoroughly
testing SDCs in simulated environments remains challenging because SDCs must be
tested with a sheer number of long-running test cases. Past results on software
testing optimization have shown that not all the test cases contribute equally
to establishing confidence in test subjects' quality and reliability, and the
execution of "safe and uninformative" test cases can be skipped to reduce
testing effort. However, this problem is only partially addressed in the
context of SDC simulation platforms. In this paper, we investigate test
selection strategies to increase the cost-effectiveness of simulation-based
testing in the context of SDCs. We propose an approach called SDC-Scissor (SDC
coSt-effeCtIve teSt SelectOR) that leverages Machine Learning (ML) strategies
to identify and skip test cases that are unlikely to detect faults in SDCs
before executing them.
Our evaluation shows that SDC-Scissor outperforms the baselines. With the
Logistic model, we achieve an accuracy of 70%, a precision of 65%, and a recall
of 80% in selecting tests leading to a fault and improved testing
cost-effectiveness. Specifically, SDC-Scissor avoided the execution of 50% of
the unnecessary tests and outperformed two baseline strategies.
Complementary to existing work, we also integrated SDC-Scissor into the context
of an industrial organization in the automotive domain to demonstrate how it
can be used in industrial settings.
Comment: arXiv admin note: substantial text overlap with arXiv:2111.0466
How does Simulation-based Testing for Self-driving Cars match Human Perception?
Software metrics such as coverage and mutation scores have been extensively
explored for the automated quality assessment of test suites. While traditional
tools rely on such quantifiable software metrics, the field of self-driving
cars (SDCs) has primarily focused on simulation-based test case generation
using quality metrics such as the out-of-bound (OOB) parameter to determine if
a test case fails or passes. However, it remains unclear to what extent this
quality metric aligns with the human perception of the safety and realism of
SDCs, which are critical aspects in assessing SDC behavior. To address this
gap, we conducted an empirical study involving 50 participants to investigate
the factors that determine how humans perceive SDC test cases as safe, unsafe,
realistic, or unrealistic. To this aim, we developed a framework leveraging
virtual reality (VR) technologies, called SDC-Alabaster, to immerse the study
participants into the virtual environment of SDC simulators. Our findings
indicate that the human assessment of the safety and realism of failing and
passing test cases can vary based on different factors, such as the test's
complexity and the possibility of interacting with the SDC. Especially for the
assessment of realism, the participants' age as a confounding factor leads to a
different perception. This study highlights the need for more research on SDC
simulation testing quality metrics and the importance of human perception in
evaluating SDC behavior.
SBFT Tool Competition 2024 -- Python Test Case Generation Track
Test case generation (TCG) for Python poses distinctive challenges due to the
language's dynamic nature and the absence of strict type information. Previous
research has successfully explored automated unit TCG for Python, with
solutions outperforming random test generation methods. Nevertheless,
fundamental issues persist, hindering the practical adoption of existing test
case generators. To address these challenges, we report on the organization,
challenges, and results of the first edition of the Python Testing Competition.
Four tools, namely UTBotPython, Klara, Hypothesis Ghostwriter, and Pynguin, were
executed on a benchmark set consisting of 35 Python source files sampled from 7
open-source Python projects for a time budget of 400 seconds. We considered one
configuration of each tool for each test subject and evaluated the tools'
effectiveness in terms of code and mutation coverage. This paper describes our
methodology, the analysis of the results together with the competing tools, and
the challenges faced while running the competition experiments.Comment: 4 pages, to appear in the Proceedings of the 17th International
Workshop on Search-Based and Fuzz Testing (SBFT@ICSE 2024
TEASER: Simulation-based CAN Bus Regression Testing for Self-driving Cars Software
Software for safety-critical systems such as self-driving cars (SDCs) needs to
be tested rigorously. In particular, the electronic control units (ECUs) of
SDCs should be tested with realistic input data. In this context, a
communication protocol called Controller Area Network (CAN) is typically used
to transfer sensor data to the SDC control units. A challenge for SDC
maintainers and testers is the need to manually define the CAN inputs that
realistically represent the state of the SDC in the real world. To address this
challenge, we developed TEASER, a tool that generates realistic CAN signals
for SDCs from sensor data produced by state-of-the-art car simulators. We
evaluated TEASER based on its integration capability into a DevOps pipeline of
aicas GmbH, a company in the automotive sector. Concretely, we integrated
TEASER in a Continuous Integration (CI) pipeline configured with Jenkins. The
pipeline executes the test cases in simulation environments and sends the
sensor data over the CAN bus to a physical CAN device, which is the test
subject. Our evaluation shows the ability of TEASER to generate and execute CI
test cases that expose simulation-based faults (using regression strategies);
the tool produces CAN inputs that realistically represent the state of the SDC
in the real world. This result is of critical importance for increasing
automation and effectiveness of simulation-based CAN bus regression testing for
SDC software. Tool: https://doi.org/10.5281/zenodo.7964890 GitHub:
https://github.com/christianbirchler-org/sdc-scissor/releases/tag/v2.2.0-rc.1
Documentation: https://sdc-scissor.readthedocs.i
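To illustrate the kind of transformation TEASER performs, the sketch below packs two simulated sensor readings into the 8-byte payload of a classic CAN frame. The signal layout (two 16-bit big-endian fields, speed at 0.01 km/h resolution) is an assumption for illustration; real ECU signal layouts are defined in DBC files.

```python
import struct

def encode_can_frame(can_id, wheel_speed_kmh, rpm):
    """Pack two simulated sensor readings into the 8-byte payload of a
    classic CAN frame. The layout (two 16-bit big-endian fields plus four
    padding bytes, speed at 0.01 km/h resolution) is a made-up example;
    real ECU signal layouts come from a DBC file."""
    speed_raw = int(round(wheel_speed_kmh * 100))  # 0.01 km/h resolution
    data = struct.pack(">HHxxxx", speed_raw, rpm)  # pad payload to 8 bytes
    return {"id": can_id, "dlc": 8, "data": data}

def decode_can_frame(frame):
    """Inverse of encode_can_frame, as the ECU under test would decode it."""
    speed_raw, rpm = struct.unpack(">HH", frame["data"][:4])
    return speed_raw / 100.0, rpm

frame = encode_can_frame(0x123, 42.5, 2300)
print(decode_can_frame(frame))  # -> (42.5, 2300)
```

In a setup like TEASER's, frames of this shape would be emitted continuously from the simulator's sensor stream onto a physical CAN device hosting the test subject.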
Machine learning-based test selection for simulation-based testing of self-driving cars software
Simulation platforms facilitate the development of emerging Cyber-Physical Systems (CPS) like self-driving cars (SDC) because they are more efficient and less dangerous than field operational test cases. Despite this, thoroughly testing SDCs in simulated environments remains challenging because SDCs must be tested with a sheer number of long-running test cases. Past results on software testing optimization have shown that not all the test cases contribute equally to establishing confidence in test subjects' quality and reliability, and the execution of "safe and uninformative" test cases can be skipped to reduce testing effort. However, this problem is only partially addressed in the context of SDC simulation platforms. In this paper, we investigate test selection strategies to increase the cost-effectiveness of simulation-based testing in the context of SDCs. We propose an approach called SDC-Scissor (SDC coSt-effeCtIve teSt SelectOR) that leverages Machine Learning (ML) strategies to identify and skip test cases that are unlikely to detect faults in SDCs before executing them.
Cost-effective simulation-based test selection in self-driving cars software with SDC-Scissor
Simulation platforms facilitate the continuous development of complex systems such as self-driving cars (SDCs). However, previous results on testing SDCs using simulations have shown that most of the automatically generated tests do not strongly contribute to establishing confidence in the quality and reliability of the SDC. Therefore, those tests can be characterized as “uninformative”, and running them generally means wasting precious computational resources. We address this issue with SDC-Scissor, a framework that leverages Machine Learning to identify simulation-based tests that are unlikely to detect faults in the SDC software under test and skip them before their execution. Consequently, by filtering out those tests, SDC-Scissor reduces the number of long-running simulations to execute and drastically increases the cost-effectiveness of simulation-based testing of SDC software. Our evaluation concerning two large datasets and around 12’000 tests showed that SDC-Scissor achieved a higher classification F1-score (between 47% and 90%) than a randomized baseline in identifying tests that lead to a fault and reduced the time spent running uninformative tests (speedup between 107% and 170%). Webpage & Video: https://github.com/ChristianBirchler/sdc-scisso
Single and multi-objective test cases prioritization for self-driving cars in virtual environments
Testing with simulation environments helps to identify critical failing scenarios for self-driving cars (SDCs). Simulation-based tests are safer than in-field operational tests and allow detecting software defects before deployment. However, these tests are very expensive and are too many to be run frequently within limited time constraints. In this paper, we investigate test case prioritization techniques to increase the ability to detect SDC regression faults with virtual tests earlier. Our approach, called SDC-Prioritizer, prioritizes virtual tests for SDCs according to static features of the roads we designed to be used within the driving scenarios. These features can be collected without running the tests, which means that they do not require past execution results. We introduce two evolutionary approaches to prioritize the test cases using diversity metrics (black-box heuristics) computed on these static features. These two approaches, called SO-SDC-Prioritizer and MO-SDC-Prioritizer, use single-objective and multi-objective genetic algorithms, respectively, to find trade-offs between executing the least expensive and the most diverse test cases earlier. Our empirical study conducted in the SDC domain shows that MO-SDC-Prioritizer significantly (p-value ≤ 10^-10) improves the ability to detect safety-critical failures at the same level of execution time compared to baselines: random and greedy-based test case orderings. Besides, our study indicates that multi-objective meta-heuristics outperform single-objective approaches when prioritizing simulation-based tests for SDCs. MO-SDC-Prioritizer prioritizes test cases with a large improvement in fault detection while its overhead (up to 0.45% of the test execution cost) is negligible.
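The diversity-driven ordering behind SDC-Prioritizer can be illustrated with a much simpler greedy stand-in for its genetic algorithms: repeatedly schedule the test whose static features are farthest from everything scheduled so far. The feature vectors below are hypothetical.

```python
def feature_distance(a, b):
    """Euclidean distance between two static road-feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def prioritize_by_diversity(tests):
    """Greedy black-box prioritization: repeatedly pick the test whose
    features are farthest from every already-scheduled test, so diverse
    scenarios run first. A simplified stand-in for the paper's
    single/multi-objective genetic algorithms."""
    remaining = list(tests)
    order = [remaining.pop(0)]  # seed the schedule with the first test
    while remaining:
        farthest = max(
            remaining,
            key=lambda t: min(
                feature_distance(t["features"], s["features"]) for s in order
            ),
        )
        remaining.remove(farthest)
        order.append(farthest)
    return order

# Hypothetical static road features (e.g. curvature statistics) per test.
tests = [
    {"id": "a", "features": (0.0, 0.0)},
    {"id": "b", "features": (0.1, 0.0)},
    {"id": "c", "features": (5.0, 5.0)},
]
print([t["id"] for t in prioritize_by_diversity(tests)])  # -> ['a', 'c', 'b']
```

Note how the near-duplicate test "b" is pushed to the end: because the features are static, this ordering is computed entirely before any simulation runs.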